Classification of Parallel Database Systems


• Shared Memory (SM); also called Shared Everything (SE)

• Shared Disk (SD)

• Shared Nothing (SN)
Shared-Memory (SM) Parallel Database Systems


• All processors directly and symmetrically access all main memory and disks.

• In general, the operating system allocates processors to processes.

• Processors have local caches to reduce network traffic, but loading and flushing caches can degrade performance.

• Hardware-specific solutions are required to ensure coherence among the caches (e.g., processors continuously snoop the shared bus to see whether their cached data is required elsewhere).

• Typical Hardware: IBM 3090, IBM 370, Bull DBS8, Encore, Sequent Symmetry
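
The bus-snooping idea above can be illustrated with a toy write-invalidate model. This is only a sketch under simplifying assumptions (write-through caches, one value per address); the class names `Bus` and `Cache` are hypothetical and do not correspond to any of the hardware listed.

```python
class Bus:
    """Shared bus: every attached cache observes every write announcement."""

    def __init__(self):
        self.caches = []
        self.memory = {}  # address -> value (shared main memory)

    def broadcast_write(self, writer, addr):
        # Every other cache snoops the write and drops its stale copy.
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)


class Cache:
    """Per-processor cache attached to the shared bus."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}  # address -> locally cached value
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: load from shared memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_write(self, addr)  # invalidate other copies
        self.lines[addr] = value
        self.bus.memory[addr] = value  # write-through, for simplicity


bus = Bus()
p0, p1 = Cache(bus), Cache(bus)
bus.memory[100] = 1
p0.read(100)
p1.read(100)         # both processors now cache address 100
p0.write(100, 2)     # p1's copy is invalidated by the snooped write
result = p1.read(100)  # miss, reloads the new value from memory
```

The reload that `p1` performs after the invalidation is an instance of the cache loading cost mentioned above.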
Shared-Disk (SD) Parallel Database Systems


• Each node has its own memory and may itself be an SMP box.

• The nodes share the same disks logically (and possibly physically as well).

• System (or application) software must ensure coherency among multiple copies of disk pages requested by different nodes.

• A query or update by a node requires it to:

1. Transmit, to all other nodes, its intention to query/update the database.

2. If the required page is currently being updated by another node, wait until it is released.

3. Read or receive the required page.

4. Perform the query or update.

• Typical Hardware: DEC VM Cluster, SUN Sparc 1000 cluster
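
The four steps above can be sketched in a single process, with threads standing in for nodes. This is a minimal illustration, not a real cluster protocol: a per-page lock is assumed to play the role of the broadcast intention and the wait, and the names `SharedDisk` and `Node` are hypothetical.

```python
import threading


class SharedDisk:
    """Disk pages visible to every node, with one lock per page."""

    def __init__(self, pages):
        self.pages = dict(pages)  # page id -> contents
        self._locks = {pid: threading.Lock() for pid in pages}

    def lock_for(self, page_id):
        return self._locks[page_id]


class Node:
    """A node with its own private buffer of page copies."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.buffer = {}  # local copies of pages

    def update(self, page_id, fn):
        # Steps 1-2: declare the intention and wait until no other
        # node is updating the page (modelled here by a page lock).
        with self.disk.lock_for(page_id):
            # Step 3: read the current page into the local buffer.
            self.buffer[page_id] = self.disk.pages[page_id]
            # Step 4: perform the update and write the page back.
            self.buffer[page_id] = fn(self.buffer[page_id])
            self.disk.pages[page_id] = self.buffer[page_id]


def worker(node, repetitions):
    for _ in range(repetitions):
        node.update("p1", lambda value: value + 1)


disk = SharedDisk({"p1": 0})
nodes = [Node(f"n{i}", disk) for i in range(4)]
threads = [threading.Thread(target=worker, args=(n, 500)) for n in nodes]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_value = disk.pages["p1"]  # no update is lost: 4 nodes * 500 increments
```

Because every update re-reads the page under the lock, a node never works from a stale copy; the price is that nodes updating the same page are serialized.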
Shared-Nothing (SN) Parallel Database Systems


• Each node has its own local memory and its own local disks.

• The database is partitioned across the nodes, thus allowing I/O parallelization.

• Each node acts as a server for its local data.

• An update by a processor proceeds as follows:

1. The processor transmits an update request to the relevant server.

2. The server performs the update locally.

3. The server acknowledges success to the requester.

• A query by a processor proceeds as follows:

1. The processor transmits the query to the relevant server.

2. The server performs the query locally.

3. The server sends the query result to the requester.

• Typical Hardware: AT&T 3600, IBM SP2, nCUBE, VAXcluster
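
The update and query sequences above amount to routing each request to the node that owns the data. The sketch below assumes integer keys and modulo partitioning, neither of which is specified in the slides; `Server` and `Cluster` are hypothetical names, and method calls stand in for network messages.

```python
class Server:
    """One shared-nothing node: serves only its own local partition."""

    def __init__(self):
        self.local = {}  # this node's partition of the database

    def update(self, key, value):
        self.local[key] = value  # the update is performed locally
        return True              # acknowledgement to the requester

    def query(self, key):
        return self.local.get(key)  # the query is evaluated locally


class Cluster:
    """Routes each request to the server that owns the key."""

    def __init__(self, node_count):
        self.servers = [Server() for _ in range(node_count)]

    def _owner(self, key):
        # Assumed partitioning rule: integer key modulo number of nodes.
        return self.servers[key % len(self.servers)]

    def update(self, key, value):
        return self._owner(key).update(key, value)

    def query(self, key):
        return self._owner(key).query(key)


cluster = Cluster(3)
ack = cluster.update(7, "row-7")  # routed to server 7 % 3 == 1
row = cluster.query(7)            # answered by the same server
```

Requests for keys owned by different servers touch disjoint nodes, which is what allows the I/O parallelization noted above.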
Metrics for Evaluating the Three Architectures


#    Metric             Explanation
1    Price              Using commodity hardware reduces system cost.
2    Throughput         Inter-query parallelism improves throughput.
3    Response Time      Intra-query parallelism improves response time.
4    Speedup            Ideally, twice the hardware should solve the problem in half the time.
5    Scaleup            Ideally, twice the hardware should solve twice the problem in the same time.
6    Startup Cost       Preparing a query for parallel execution is an overhead.
7    Interference       Processors slow each other when competing for shared resources.
8    Load Balancing     Ideally, all the processors should be working concurrently.
9    Comm. Overheads    Ideally, sub-problems of one problem should require least communication.
10   Data Availability  It is desirable to be able to tolerate failure of some nodes.
11   Portability        Porting centralized DBMS software should be relatively easy.
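
The speedup and scaleup metrics can be stated as simple ratios of elapsed times. The sketch below uses the usual definitions (old time divided by new time); the function names and the timings in the example are hypothetical, chosen only to match the "twice the hardware" wording of the table.

```python
def speedup(time_small_system, time_large_system):
    """Same problem on both systems; ideal value is the hardware factor."""
    return time_small_system / time_large_system


def scaleup(time_small_problem_small_system, time_large_problem_large_system):
    """Problem and hardware grow by the same factor; ideal value is 1."""
    return time_small_problem_small_system / time_large_problem_large_system


# Hypothetical timings in seconds, for illustration only.
ideal_speedup = speedup(100.0, 50.0)    # twice the hardware, half the time
ideal_scaleup = scaleup(100.0, 100.0)   # twice the hardware, twice the problem
actual_speedup = speedup(100.0, 60.0)   # below 2: overheads eat into the gain
```

Startup cost, interference, and poor load balancing are the usual reasons a measured speedup falls below the hardware factor or a measured scaleup falls below 1.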